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ABSTRACT 

A comprehensive study of the measurement of star formation histories from colour- 
magnitude diagrams (CMDs) is presented, with an emphasis on a variety of subtle 
issues involved in the generation of model CMDs and maximum likelihood solution. 
Among these are the need for a complete sampling of the synthetic CMD, the use of 
of proper statistics for dealing with Poisson-distributed data (and a demonstration of 
why x 2 must not be used), measuring full uncertainties in all reported parameters, 
quantifying the goodness-of-fit, and questions of binning the CMD and incorporating 
outside information. Several example star formation history measurements are given. 
Two examples involve synthetic data, in which the input and recovered parameters 
can be compared to locate possible flaws in the methodology (none were apparent) 
and measure the accuracy with which ages, metallicities, and star formation rates can 
be recovered. Solutions of the histories of seven Galactic dwarf spheroidal companions 
(Carina, Draco, Leo I, Leo II, Sagittarius, Sculptor, and Ursa Minor) illustrate the 
ability to measure star formation histories given a variety conditions - numbers of 
stars, complexity of star formation history, and amount of foreground contamination. 
Significant measurements of ancient > 8 Gyr star formation are made in all seven 
galaxies. Sculptor, Draco, and Ursa Minor appear entirely ancient, while the other 
systems show varying amounts of younger stars. 

Key words: galaxies: stellar content - Local Group - methods: numerical - methods: 
statistical 


1 INTRODUCTION 

The evolution of galaxies can be studied two ways - one can 
either look at high redsliift to observe the past directly, or 
one can look at the fossil remains of past events in nearby 
galaxies. These approaches are complementary, as the first 
is more direct (the ancient light is being observed now) but 
allows only a statistical comparison of the events seen hap¬ 
pening at different ages (we cannot be entirely sure which 
systems at high redshift are analogous to which systems in 
the nearby universe). In contrast the measurement of the 
star formation history of a nearby galaxy (one whose stellar 
content is resolved) allows one to trace the history of a sin¬ 
gle system, but it is difficult to determine that history in an 
unambiguous way. 

The measurement of star formation histories via com¬ 
parisons of observed and synthetic colour-magnitude dia¬ 
grams (CMDs) is an active field that is evolving rapidly. The 
first papers on the topic arrived in the literature only slightly 
more than a decade ago, with Tosi et al. (1989) and Bertelli 
et al. (1992) two early attempts to derive the star formation 


histories of composite populations (stars of a range of ages 
and metallicities). As opposed to isochrone fitting in single¬ 
population objects, such as globular clusters, the measure¬ 
ment of a star formation history of a composite system is 
a daunting task - SFR(t,Z), distance, extinction/reddening, 
initial mass function (IMF), and binary distribution are all 
unknowns at some level; while the comparison of CMDs was 
done subjectively. In order to cope with the vast combina¬ 
tions of parameters possible given limitations in computer 
speed and the number of subjective comparisons that could 
be made, these early studies limited the parameter space, 
generally measuring only a small set of SFR(t) functions 
and assuming fixed values for all other parameters. 

The work of Gallart et al. (1996a), studying the old stel¬ 
lar content of NGC 6822, was the first attempt to quantify 
the subjective CMD comparisons. In that work, the authors 
constructed a large number of parameters, each of which 
measured the position, size, and/or number of stars of a 
certain feature of the CMD. This allowed the first quantita¬ 
tive judgment of a star formation history, although both the 
procedures for generating synthetic CMDs and comparing 
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CMDs were still extremely slow, forcing a solution of only 
SFR(t) and Z(t). 

The shift to a fully quantitative analysis was proposed 
independently by Dolphin (1997) and Aparicio, Gallart, & 
Bertelli (1997), who proposed binning the CMD into sec¬ 
tions and performing a y 2 minimization on the number of 
stars in each section to determine the star formation his¬ 
tory. Dolphin (1997) demonstrated the sensitivity of such 
a method to metallicity, distance, and extinction, and as 
well as its ability to correctly reconstruct the star forma¬ 
tion history of a synthetic population; Aparicio et al. (1997) 
applied a remarkably similar algorithm to a study of LGS 
3. The advantage in such a technique lay in its ability to 
use all parts of the CMD in measuring the star formation 
history - thus allowing it to be used on photometry of any 
quality and depth - as well as the obvious advantage of hav¬ 
ing a single-parameter fit that can be used in a numerical 
minimization. 

The number of groups working on this topic continues 
to increase; Tolstoy & Saha (1996); Holtzman et al. (1999); 
Olsen (1999); Hernandez, Gilmore, & Valls-Gabaud (2000); 
and Harris & Zaritsky (2001) is only a partial list of other 
groups that are using CMDs to measure past star formation 
histories. These techniques have been applied to many of 
the Local Group galaxies, as well as a few galaxies just out¬ 
side the Local Group. Despite the large number of papers 
on this topic, the literature lacks thorough methodology pa¬ 
pers describing modern techniques, largely because of the 
incremental improvements in methods that have been im¬ 
plemented by each group. An example of this is the series 
of papers by Dolphin Dolphin (1997) presenting the initial 
method; Dolphin (2000a), Dolphin et al. (2001a), Miller et 
al. (2001), and Dolphin (2001) each containing minor im¬ 
provements to the technique - which forces the reader to 
follow a paper trail to determine what any one group is cur¬ 
rently doing. Another significant void in the literature is a 
realistic estimation of how well one can measure star for¬ 
mation histories under a variety of conditions - number of 
stars in the CMD, amount of foreground contamination, and 
complexity of star formation. 

The present work attempts to fill these needs in the lit¬ 
erature, in addition to addressing commonly-made mistakes. 
This paper is divided into two main sections - a detailed de¬ 
scription of how to measure star formation histories from 
CMDs and application to artificial and real data. 


2 ANALYSIS PROCEDURE 

In any measurement of the star formation history, one funda¬ 
mental question must be addressed: what set of star forma¬ 
tion histories could have created the observations? In order 
to answer this question, the following steps must be taken: 

• Generate synthetic CMD based on theoretical 
isochrones 

• Account for incompleteness and observational errors 

• Measure the best star formation history and its quality 

• Measure the allowable range of other star formation his¬ 
tories 

Each of these steps will be addressed in the sections below. 


2.1 Synthetic CMD Generation 

The generation of synthetic CMDs, as described in this pa¬ 
per, is a two-step process - generation of “clean” CMDs and 
the introduction of incompleteness and observational errors. 
(The reason for the split is merely a computational one. 
Generation of the clean CMDs is the most time-consuming 
part of the entire measurement, but only needs to be done 
once; application of observational errors will be different at 
different assumed distance and extinction values.) The end 
result of this process is intended to be a model CMD - the 
probability distribution from which the observed data are 
drawn. The question of binning the CM D vs . storing indi¬ 
vidual stars will be addressed in section |2.3[ For the time 
being, we will simply assume that the CMD is to be binned. 

As was pointed out by Dolphin (1997), a CMD of a 
composite population is simply the sum of the CMDs of 
its constituent parts. Thus, for any given distance, extinc¬ 
tion, IMF, and binary distribution, the CMD corresponding 
to any SFR(t,Z) can be computed as the sum of its parts. 
(If one wishes to solve also for distance or any of the other 
“fixed” parameters, separate solutions must be made at each 
combination of fixed parameters) This makes it unneces¬ 
sary to spend the vast computational resources used in early 
studies, as the “partial CMDs” - model CMDs containing 
small ranges in age and metallicity - need to be computed 
only once. If one computes each partial CMD with the same 
star formation rate, such as 1 Mgj/r -1 , the model CMD for 
an arbitrary star formation history is given by 

mi =y^r :j Cij. ( 1 ) 

j 

where rm is the full model CMD in bin i, rj is the star for¬ 
mation rate for partial CMD j in Mgyr -1 , and aj is bin 
i of partial CMD j. This relation makes the computational 
problem much easier, but determining ctj is nevertheless a 
non-trivial procedure. The usual procedure for this process is 
to randomly populate each partial CMD with a large number 
of stars randomly drawn from the age and metallicity range, 
and an adopted IMF. Although an attractive algorithm for 
its simplicity, it is impossible in practice to adequately sam¬ 
ple the model CMD this way as the density of points on the 
CMD varies by many orders of magnitude between the lower 
main sequence and the Hertzsprung gap. Additionally, such 
a “random drawing” routine inevitably adds random errors 
to the CMD, thus making the CMD comparison one of data- 
data rather than data-model. Since we can determine the 
underlying model via the process described here, there is no 
need to apply a statistically-weaker data-data comparison 
that assumes no knowledge of the model. 

Thus we seek to find an algorithm that will generate 
a true model CMD. In order to accomplish this, one must 
completely sample all possible combinations of mass (includ¬ 
ing secondary mass in unresolved binary systems), metallic¬ 
ity and age comprising the partial CMD. The process will 
thus first involve calculating a sufficiently large number of 
isochrones, so that the space between adjacent isochrones is 
much smaller than the CMD bin size. For example, the Gi- 
rardi et al. (2000) isochrones with {Z, logf) of (0.001,8.50) 
and (0.001, 8.55) have a maximum separation of AV = 0.45 
magnitudes and A (V — I) = 0.10. Even if using a coarse bin¬ 
ning size of 0.1 magnitudes in V, this would require small 
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steps of approximately Alogf = 0.005. The isochrones at 
(0.001, 8.50) and (0.004, 8.50) have maximum separations of 
AV = 1.04 and A(C — I) = 0.71, thus requiring small steps 
in metallicity (~ 0.02 dex) to adequately fill in the CMD. 
These values are only samples; the actual step sizes used in 
model generation depend on the spacing between isochrones 
and the CMD binning size. It should be noted that one can¬ 
not hope to actually measure ages and metallicities at this 
level of precision; however the presence of those isochrones 
is necessary in order to provide a complete model CMD. It 
is clear that an interpolation scheme is mandatory in this 
process; the details of this are beyond the scope of this pa¬ 
per. 

Once the set of needed isochrones has been established, 
each isochrone is then considered in turn and divided into 
an appropriate number of points. Again, the stepsize is a 
function of the CMD binning size. Each point is weighted 
by the IMF, the mass difference between it and the adjacent 
points, and the stepsizes in age and metallicity. Binaries are 
also added at this point; again it is necessary to sample the 
range of secondary masses to at least the accuracy of the 
binned CMD. 

The result of this process can be thought of as a 
“blurred isochrone”, centred roughly on the central age and 
metallicity used for the partial CMD. What is critical is 
that all possible masses, metallicities, ages, and binary com¬ 
binations within this range are accounted for in the binned 
CMD; this goal can only be achieved by the procedure out¬ 
lined above. 


2.2 Simulating Observational Conditions 


Of course, an observed CMD is never purely a pure 
isochrone; photometric errors, blending, incompleteness, 
bad/false detections, and foreground contamination all com¬ 
plicate matters. In generating an accurate model CMD, all 
of these factors must be taken into account. As was pointed 
out by Gallart et al. (1996b), the problems of photomet¬ 
ric errors, blending, and incompleteness can be addressed in 
one step through the use of a library of artificial star tests. 
Indeed, this is the only accurate way of addressing these 
problems, as incompleteness is a function of the observed 
magnitude of a star rather than its true magnitude, blend¬ 
ing errors depend on the density of stars and the relative 
distributions within the CMD, and even simple photometric 
errors are biased and non-Gaussian. 

Thus the necessary procedure for simulating the first 
three observational effects is to generate a very large library 
of artificial stars, which includes the necessary range of input 
magnitudes and colours with a sufficiently large number of 
stars input at each location on the CMD so that the distri¬ 
bution of recovered photometry is adequately sampled. For 
each point on the partial CMD created in section 2.1, it is 


necessary to multiply its weight by the completeness frac¬ 
tion and distribute that weight according to the distribution 
of recovered artificial stars. 

It is also possible to correct for foreground contamina¬ 
tion in a consistent manner. Again it is necessary to realize 
that foreground stars in the observed data are randomly 
drawn from an intrinsic distribution in the same manner as 
the object stars. Thus they can be modeled in the same man¬ 
ner as the object stars - by constructing a model foreground 
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CMD. This is usually not done with isochrones; the common 
procedure is instead to observe a second field nearby (in 
terms of Galactic coordinates) the object field but well be¬ 
yond the limits of the object being studied. A small amount 
of smoothing of the foreground CMD is generally necessary, 
and the resulting CMD can be added to the partial CMDs 
to create a better representation of the model from which 
the observed data are drawn. This procedure is clearly su¬ 
perior to the more commonly-used “statistical subtraction”, 
as the subtraction process inevitably leaves residuals (over¬ 
subtraction and undersubtraction) and thus a CMD that is 
not representative of the underlying sta r formation history. 
It will be demonstrated in Section 4.6 that, when treated 


properly, foreground contamination can be dealt with eas¬ 
ily- 


A final problem is the presence of bad points - short- 
period variables, stars hit by cosmic rays in one image, bad 
pixels, etc. - that cannot be modeled either by the partial 
CMDs or by a foreground CMD. Again, it is necessary to 
attempt to create the underlying model distribution from 
which these “objects” are drawn. This model consists largely 
of two distributions. Purely artificial objects, such as cosmic 
rays and bad pixels, are likely to be spread anywhere on the 
observed CMD, and should be fit with a flat distribution. 
Short-period variables and stars plus cosmic rays will likely 
fall near the observed distribution of points, and should be 
modeled by smoothing either the observed or model CMD 
(usually the observed CMD). As with the foreground CMD, 
the “bad point” CMD is to be added to the partial CMDs 
when determining the model CMD for a star formation his¬ 
tory. Thus equation ji] becomes 


mi = ^ Tjdj + fi + bpt, 


( 2 ) 


where /; is bin i of the foreground CMD and bpt is bin i of 
the combined bad point CMD. 


2.3 Comparison of Model CMD with Data 


When faced with the task of fitting data to a model, most 
scientists tend to first think of using \ 2 - As X 2 is simply 
related to the differences between data and model and the 
predicted la errors, there is a certain intuitive attractiveness 
to this approach. Less appreciated, though, is the fact that 
minimizing y; 2 is actually a maximum-likelihood calculation 
for the case of data with Gaussian errors and known uncer¬ 
tainties at each point. This can be demonstrated trivially as 
follows. Pi denotes the probability that the observation n is 
drawn from model m, m.i is the model value of bin i, m is 
observed value of bin i, and Oi is the uncertainty of bin i. 



( 3 ) 


We can define a “Gaussian likelihood ratio” as the proba¬ 
bility that observed data point m was drawn from a model 
equal to m; divided by the probability that it was drawn 
from a model equal to m. (This is equivalent to the term 
“relative probability” used by Tolstoy & Saha 1996). 


GLRi 



—mP) Itj 

1 ' ' 77 


( 4 ) 
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where a m i equals the expected uncertainty with model rrii 
and o n i is that for model m. Multiplying the individual 
Gaussian likelihood ratios and taking the logarithm, we ob¬ 
tain 


-2 In GLR 



(m — rm ) 2 


a 


2 > 

mi 


(5) 


or simply 

-21nGLf? = x 2 + ^ln^fi. (6) 


Thus, if the observational error distribution is a smooth 
Gaussian, and if the cq values do not change during the fit 
(in which case ln(cr;%/< t)%) = 0), minimizing x 2 is the equiv¬ 
alent of finding the model most likely to have produced the 
observations. However, neither of these assumptions is true 
in CMD analysis - the data follow a Poisson distribution, 
and = mi while aP = m. 

The danger in using x 2 to minimize Poisson-distributed 
data is that the determined “solution” will not actually be 
the correct solution. Examples are given by Mighcll (1999); 
the reader can easily verify this fact by populating an array 
with, on average, 1 point per bin and minimizing x 2 to find 
the mean. Depending on the formulation of \ 2 used, one’s 
“fit” will be incorrect by up to 42%. Mighell (1999) proposes 
an alternative statistic (x 2 ) that will minimize properly, but 
a better solution can be found by deriving a statistic based 
on a Poisson, rather than Gaussian, probability function. 

Instead of using a x 2 fit, with its implicit assumptions 
of the data, one should instead use a maximum likelihood 
parameter based on the Poisson probability distribution 


Pi = 


i rn\ 


(7) 


The “Poisson likelihood ratio” is analogous to the Gaussian 
likelihood ratio (y 2 ) in equation [ij. Cancelling the m\ terms 
in numerator and denominator, we have 


PLRi 


( 8 ) 


the ratio of the probability of drawing m points from model 
mi to that of drawing rii points from model n;. The cumu¬ 
lative likelihood ratio is given by 


PLR = 



(9) 


and the Poisson equivalent of x 2 is 


E m 

mi — m m In — 

777,4 


( 10 ) 


An examination of this parameter indicates that it shares 
many of the same features as x 2 > namely that it is zero 
when rii = mi and that the expectation value and vari¬ 
ance are 1 and 2, respectively, at large values of mi. Ad¬ 
ditionally, minimizing this parameter is truly a maximum 
likelihood calculation, and applying this parameter to the 
example given above will result in a correct determination 
of the mean. Thus, given the presence of a Poisson-based 
statistic that can be minimized in the same manner as x ,2 > 
there is no good reason to use x 2 to fit a CMD, as x 2 wifi 
always minimize to the wrong solution. 


Before turning our attention to more general aspects of 
finding the best fit, we need to address a pair of statistics. 
First is the Saha W statistic (Saha 1998) 


Wi = 


(mi + m)\ 
milml 


( 11 ) 


As noted by Saha (1998), this parameter is proportional to 
the probability that observed data sets mi and rii are drawn 
from the same model distribution, without any knowledge of 
what that model is. It is therefore a data-data comparison 
and cannot be used for the model-data comparison we wish 
to perform. (The fact that one is taking a factorial of a non¬ 
integer mi is the first indication that it is unsuitable for 
such a task.) Incidentally, a related statistic can be used for 
comparison of data with a randomly-drawn synthetic CMD, 
though it is significantly more complex than the Poisson 
likelihood ratio. Rather than determining the likelihood that 
two data sets are random realizations the same model (the 
basis of the W statistic), one instead measures the likelihood 
that the observed data are a random realization of some 
linear combination of the models from which the synthetic 
CMDs are drawn. This probability is given by 


p -n< 


1 I; s ':i' J mil= o 



^.(C J+ 1 )m„ ^2 c . m ..) n * J [(/)/”;'!)}. (12) 

j j 


where rii is the number of observed points in CMD bin i, Sij 
is the number of synthetic points in CMD bin i of partial 
CMD j, mij is the underlying model in CMD bin i of partial 
CMD j, and Cj is the star formation rate corresponding to 
partial CMD j. Substituting Xij = mij(cj + 1) and yj = 
Cj/( c j + 1), this simplifies slightly to 


r=n< 


1 I I - j/J-Q 
ni'-YljSiP 



1(52 I] {e-***x%dxij)]}. (13) 


Expanding the first sum inside the integral to a polynomial, 
the integral is reduced to a set of gamma functions, which 
can be easily solved. 

The final statistical treatment to be considered is the 
Bayesian inference scheme proposed by Tolstoy & Saha 
(1996), which is usually seen as a “bin-free” statistic. How¬ 
ever, if the CMD binning grid is sufficiently fine, so that m; 
adequately describes the density of model points everywhere 
within the bin (in other words, the binning size is smaller 
than the CMD features), then their probability of measuring 
a point in CMD bin i becomes merely the fraction of model 
points that are in that CMD bin, or 



(14) 


In this equation Pi is the probability of an observed point 
falling in CMD bin i, and mi is (as before) the number 
of model points in CMD bin i. This formulation is a slight 
improvement over their equation 11, as it allows for the more 
accurate treatment of observational errors described in 2.2 


instead of the unbiased Gaussian errors assumed by Tolstoy 
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& Saha (1996). The cumulative probability of drawing the 
entire observed data set n; is thus given by 


P = 



(15) 


which produces 


— 2 In P = 'y ' m In 


rrn 


E, 


(16) 


Aside from factors of the model normalization (V) rrii), equa¬ 
tion [lij will minimize identically to equation |lo| above - the 
only difference is that equation ^ throws away the overall 
star formation rate information and thus returns only rel¬ 
ative star formation rates, while equation [To] does not. As 
there is nothing to be gained by using equation |l6| instead 
of equation |I(|, we will not discuss the Bayesian inference 
scheme further. It should be noted, however, that use of 
Bayesian inference (as formalized in equation §) is not an 
“error” in the sense that a x 2 fit is wrong; it simply is of 
less value than the Poisson likelihood ratio. 


2.4 Determination of the Best Fit 


Something that is frequently considered in minimization so¬ 
lutions is the particular algorithm used to determine the best 
fit. Specifically, genetic and annealing algorithms are com¬ 
monly applied because these are less likely to be trapped in 
local minima. However, it is worth considering whether this 
is actually a valid concern. Given any arbitrary star forma¬ 
tion history rj that produces a model CMD given by rrii , 
the Poisson likelihood ratio will be given by 

= 2 ££ rjdj + fi+ bpi] - m 

i 3 


+ rn In 


rii 

Ej + fi+ bpi ' 


(17) 


If dvj is a small vector in the direction of the best fit, the 
Poisson likelihood ratio at r,- + dvj is given by 


fit(rj + dvj) = 2 £'['£(»":7 + dvj)aj + fi + bpi} — m 


+ rii In 


V .(rj + dvj)ci,j +fi + bpi ' 


(18) 


The change in the likelihood ratio by introducing dvj is 


! £{£ dv j°id] ~ n i inf 1 + - 1 

^ ^ V,; r .i r '. 


E, dvjCij 


j + fi + bp 


■]}■ (19) 


Since the vector dvj is arbitrarily small, we can approximate 
ln(l + x ) with x, producing 

A/it = 2 £{[£d«jCi,j](l- E-)}. (20) 

zz Tibi 
* 3 

Since any movement toward the best fit will lower the model 
CMD where it is currently overestimated (^ . dvjCi,j < 0 
where m/mi < 1) and raise it where it is underestimated 
(Ej dvjCi,j > 0 where rn/rrii > 1), A fit will always be neg¬ 
ative when moving in the right direction. Thus there are no 
local minima in the solution for SFR(t,Z), and any reason¬ 
able minimization algorithm can be used. 
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This is generally accomplished by using a minimization 
routine, such as frprmn from Numerical Recipes (Press et al. 
1992) to measure the values of r, (equation 0) that minimize 
the fit parameter. Alternately, one can apply the variational 
calculus technique of Hernandez, Valls-Gabaud, & Gilmore 
(1999), although significant additions to their method must 
be made to adequately deal with observational effects and 
a function SFR(t,Z) that varies with two parameters. Since 
the variational calculus approach assumes SFR(t) to be con¬ 
tinuous on a timescale of 0.1 Gyr, and since it is demon¬ 
strated in the next section that the high resolution in age 
and metallicity that is required when using the variational 
calculus technique actually increases the uncertainties in the 
measurement, we will adopt the former technique. 

Regarding the specific choice of minimization routine, 
any of the general routines given in Numerical Recipes - 
amoeba, powell, dfpmin, or frprmn - will work, given triv¬ 
ial modifications to eliminate negative star formation rates. 
Downhill simplex (amoeba) and Powell’s method (powell) 
are the simplest, requiring only the ability to measure the 
fit parameter given any star formation history. As noted 
by Press et al. (1992), Powell’s method converges signifi¬ 
cantly faster than a downhill simplex, and thus is preferred. 
The remaining two algorithms are potentially faster, pro¬ 
vided that they can be supplied with the fit parameter and 
derivatives at any arbitrary star formation history and that 
the measurement of the derivatives takes less time than N 
computations of the fit parameter (N being the number of 
partial CMDs). Using the Poisson likelihood ratio defined in 
equation |Io] and the model CMD defined in equation the 
derivative is given by 


df df dim 

drk Ej y mi drk Ej rrii ° l ’ k 

i i 


( 21 ) 


The quantity 1 — — needs to be calculated only once in 
each CMD bin (and is the longest part of the calculation), 
allowing the full set of derivatives to be calculated in lit¬ 
tle more time than the fit parameter itself. In terms of 
the choice between the Davidon-Fletcher-Powell (DFP) al¬ 
gorithm and the Fletcher-Reeves-Polak-Ribiere (FRPR) al¬ 
gorithm, I have found that the DFP algorithm (as imple¬ 
mented by Press et al. 1992) frequently fails to converge, 
while the FRPR algorithm has no such difficulties; the rec¬ 
ommendation is thus for the use of the FRPR algorithm 
(frprmn in Press et al. 1992). 

This algorithm converges very quickly if near the min¬ 
imum (sufficiently close that the fit parameter becomes 
quadratic); however it can take some time if far away. A 
very fast way to provide a rough initial value is to start 
with all star formation rates set to zero and incrementally 
add to the rates whose gradients are the most negative. This 
requires scaling the partial CMDs to contain comparable 
numbers of stars, since otherwise the rates producing the 
most stars - rather than those most resembling the observed 
CMD will be filled in this technique. This scaling concern 
is also present in frprmn, as it is essentially a sophisticated 
steepest-descent algorithm. 

As both the number of iterations required and the time 
taken per iteration scale as N in the FRPR algorithm and 
in the initial seeding algorithm, the total time for conver¬ 
gence scales as N 2 . If one is using a very large number of 
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partial CMDs, it may be preferable to sacrifice accuracy 
for speed. An algorithm that will give a good (though not 
excellent) fit to the data while scaling as N is given here. 
Beginning with an initial guess (usually of a constant star 
formation rate), the following two-step iteration procedure 
is made until sufficient convergence is reached. The first step 
is a measurement of the model CMD (m;) using equation 
the second is an updating of the star formation rates re¬ 
using 


rj = fjo 


Cijm/rm 


( 22 ) 


This is a very crude algorithm, and does not converge to 
high precision quickly. However, it reaches moderate levels 
of convergence fast enough that, for N > 150, the speed 
improvement is significant and outweighs the small amount 
of accuracy lost. 


2.5 Determination of the Uncertainties 

Once one follows the process above and determines a “best 
fit”, the resulting numbers are the star formation rates cor¬ 
responding to each partial CMD and the overall fit param¬ 
eter. Arriving at these values is certainly important in star 
formation history studies, but two central questions remain 
unanswered: (1) how far from the best fit is the “true fit”, 
and (2) how good is the best fit. This and the following sec¬ 
tion address these questions. The “true fit” is defined here 
as the fit corresponding to the actual star formation history. 

The first question is of fundamental importance, be¬ 
cause merely quoting the “best-fitting star formation his¬ 
tory” is useless unless one also provides a measurement of 
the uncertainties. In Gaussian data fit using y 2 , for example, 
we know that the mean \ 2 (not reduced) difference between 
the underlying model and the best fit equals the number 
of free parameters in the fit. (This, of course, is why one 
uses the number of “degrees of freedom” - the number of 
measurements minus the number of parameters - when cal¬ 
culating a reduced x 2 -) 

For Poisson-distributed data, however, the expectation 
value of the Poisson likelihood ratio is not a constant value, 
but rather varies with the number of model points in each 
bin. Assuming that the fit is most driven by the bins con¬ 
tributing the most to the variance of the fit parameter, the 
mean difference between the fit parameter of the best fit and 
underlying model will be the sum of the expectation values 
of the first N fit parameters of the bins with the largest 
expected variances (N being the number of free parameters 
in the solution). This sounds more complex than it is, as 
the variance and expectation values can be calculated eas¬ 
ily for any number of model points. In practice, this value 
is generally a little more than N, as the expectation val¬ 
ues slightly exceed 1.0 where the variances are the highest. 
One can therefore approximate the difference as equaling 
the number of free parameters in the fit; a proper calcula¬ 
tion requires calculating the fit parameter expectation value 
and variance (both are functions of the number of model 
points) in each CMD bin. 

A brief comment regarding the determination of the 
number of free parameters should be made. Although, by 
definition, this should equal the total number of partial 
CMDs, plus one for any foreground or bad star CMD that 


was fit, plus one for any “fixed parameter” that was fit, the 
number is generally much smaller. There are two reasons for 
this - the restriction of non-negative star formation rates 
and the inclusion of partial CMDs that in no way resemble 
any part of the observed CMD. 

The effect of the first can be demonstrated easily. Fit¬ 
ting 10 Gaussian-distributed points, all with mean values of 
0, to the line y = a + bx with no restrictions on a and b re¬ 
turns a mean x 2 of 8.0 10 points minus 2 free parameters. 

Requiring a > 0 causes the mean x 2 to equal 8.5, effectively 
producing 1.5 free parameters since a < 0 half the time; the 
same is true of b. Finally, requiring both a > 0 and b > 0 
returns a mean of 9.32. The average number of free pa¬ 
rameters in this fit (2 if both a and b are positive, 1 if one is 
positive, and two if neither is positive) equals 0.68; thus the 
effective number of free parameters in a fit that restricts pa¬ 
rameters from being negative is simply equal to the number 
of positive parameters in the solution. 

Any partial CMDs that are completely orthogonal to 
the observed CMD likewise do not add to the effective 
number of free parameters. For example, one can again 
consider 10 Gaussian-distributed points, and fit to y = 
afi(x) + 6 / 2 ( 0 :). ff f 2 {x) is zero in the range of x values 
used in these points, the mean \ 2 is 9.0 even though there 
are technically two free parameters in the fit. Since, in a 
CMD fit, a partial CMD is not zero everywhere, 6 will be 
constrained (and forced to zero) by the lack of observed stars 
where its partial CMD is non-zero, we can again simply ig¬ 
nore any star formation rates that are measured to be zero 
when counting free parameters. 

The presence of nearly-degenerate isochrones, however, 
does not decrease the effective number of free parameters. 
For example, fitting 10 Gaussian-distributed points with x 
values between 0 and 9 to the curve y = a + bx + cx 1 000001 
returns a mean x 2 of 7.0, despite the nearly complete degen¬ 
eracy of the second and third terms (S ) 1 -® 30001 = 9.00 0 02). 
In practice, the parameters 6 and c are generally determined 
to be very large and opposite numbers, so the limitation of 
non-negative star formation rates will cause one to be zero 
and thus not counted as a free parameter. In either case, the 
presence of nearly-degenerate isochrones in the solution re¬ 
quires no additional effort in measuring the effective number 
of free parameters. 

Thus armed with knowledge of the minimized fit pa¬ 
rameter and the expected difference between the minimized 
fit parameter and the fit parameter of the underlying model, 
one can search all free parameters to determine the range 
of acceptable values. Measurement of the uncertainties in 
determinations of fixed parameters such as distance, extinc¬ 
tion, etc. are quite simple, as a separate solution must be 
made for each combination (because the partial CMDs will 
be different if any fixed parameters change). One can deter¬ 
mine the best fit at each trial distance, for example, and the 
range of distances producing fit values within the acceptable 
range gives the uncertainty in distance. 

Measurement of uncertainties of the star formation rate 
and metallicity, however, requires slightly more work. The 
uncertainties come from two sources - uncertainties in the 
fixed parameters and acceptable ranges within any one fit 
- which must be added in quadrature. The first source is 
easy to quantify as one can simply find the extreme values 
of star formation rates and metallicities in the fits returning 
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acceptable fit parameters. The second source of uncertainty 
is more difficult, and requires a trial-and-error testing of the 
acceptable range. During such a test, a parameter should be 
fixed to a variety of values, with the other parameters re¬ 
solved to minimize the fit. By allowing the other parameters 
to vary, one will find the true uncertainty for that parameter, 
including the effects of correlated errors (as is usually seen 
in adjacent isochrones). 

An alternate approach to the problem is to use a Monte 
Carlo test. The usual method for doing this is to build a 
large number of simulated observed CMDs, using the best- 
fit fixed parameters and star formation history and solve 
for the histories of those CMDs. Any difference between the 
average solved values and the input values indicates a bias 
in the solution, while the scatter indicates the uncertainties. 
Such a test is not, in fact, correct in the strictest sense, 
as one should actually attempt to find the range of input 
parameters that produce the same star formation history 
as did the observed data. However, if one’s star formation 
history routine is unbiased (which can be tested by sol ving 
simulated data, as is done below in sections [0] and [i.jj ) 
the Monte Carlo test should provide a reasonably accurate 
estimate of the uncertainties. If the solution is biased (such 
as if one is using a x 2 fit); the Monte Carlo test cannot be 
used reliably. 

A final caution should be given against arbitrarily high 
precision in the recovered star formation history. Hernandez 
et al. (2000) measure star formation rates from 0 — 15 Gyr 
with steps of 0.1 Gyr, thus effectively using 150 free param¬ 
eters in the fit. For the same objects they studied, I will 
use 11 age bins in my solutions. The coarser binning strat¬ 
egy produces smaller uncertainties for two reasons. First, 
the acceptable range in the fit parameter increases linearly 
with the number of free parameters; as the fit parameter 
essentially goes as the square of errors in the parameters, 
using 11 free parameters instead of 150 decreases uncertain¬ 
ties by a factor of 3.7. Second, a bin of 0.1 Gyr can have its 
star formation rate varied quite severely before generating a 
bad fit; bins of 1 Gyr or more have much less freedom. This 
contributes another factor of N to the uncertainties (not 
Vn, since errors in adjacent bins are correlated), giving the 
coarse (11-bin) fit another factor of 13.6 improvement in pre¬ 
cision, or a total of a factor of 50 improvement in the error 
bars. As the solutions presented below are generally 1 — 2a 
detections using an 11-bin resolution (thus necessitating yet 
lower resolution), an accurate measurement of the Hernan¬ 
dez et al. (2000) uncertainties would indicate that each point 
on their curves has a true uncertainty of at leat 10 times its 
measured value! 


2.6 Measurement of the Fit Quality 

The final main issue that must be addressed here is that 
of the goodness-of-fit. This is an entirely different question 
than that discussed in the previous section. The last sec¬ 
tion discussed the fit parameters of various solutions given 
one observed data set; this section discusses the fit param¬ 
eters of various observed data sets drawn from the same 
solution. For example, again using the analogy of Gaussian- 
distributed data, the mean difference of y 2 between solution 
producing the best fit and the solution from which the data 
were drawn equals the number of free parameters in the so- 
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lution. However, the variance of x for many data sets drawn 
from the same model distribution equals twice the number 
of degrees of freedom. 

The goodness-of-fit can be quantified in at least two 
ways - using a percentile or a number of a of error - either 
of which is acceptable. The first test is done by generating a 
large number of fit parameters for artificial data that were 
drawn randomly from the best-fitting model. Since one al¬ 
ready knows the number of model points in each CMD bin 
(determined during the minimization process), this can be 
done very quickly. By determining where or if the minimized 
fit parameter (plus the correction for t he number of free pa¬ 
rameters, as described in section 2.5) falls within the his¬ 
togram of random drawings, one can ascertain the goodness 
of the fit. Such a technique was used by Hernandez et al. 
( 2000 ). 

The second test can be also done easily, as the expecta¬ 
tion values and variances of the fit paramete r in each CMD 
bin were determined as described in section 2.5, By adding 
these, a combined expectation value and variance can be 
determined for the best-fitting model. To quantify the fit 
quality, the minimized fit parameter (again corrected for the 
number of free parameters) can be framed in terms of a away 
from an ideal fit, using the following definition: 


Q = 


fit parametei — expectation value 


V 


variance 


(23) 


If Q is zero, the data represent a typical random drawing 
from the best model. If Q is one, the data are la worse than 
a typical random drawing from the best model. Since most 
scientists are more familiar with \ 2 values, I also define an 
“effective x 2 ” : 


Xeff — 1 + Q\f2/N , 


(24) 


where N should equal the number of CMD bins containing 
either stars or some minimum number of model points. (N 
could also be defined as the total number of CMD bins, but 
a bin with zero model points and zero observed points does 
not contribute to the Poisson likelihood ratio and thus one 
could obtain arbitrarily good Xeff values by making a vast 
CMD.) 


2.7 Binning the CMD 

Until now, the question of the exact technique for CMD 
binning has not been mentioned, as it was not relevant. The 
techniques described earlier - synthetic CMD generation, 
the Poisson likelihood ratio, and the methods for determin¬ 
ing uncertainties and goodness-of-fit - are valid for any bin¬ 
ning scheme. The obvious choice of a binning size is suffi¬ 
ciently small that CMD features are not lost, with the bin 
size comparable to the size of the smallest features to which 
one wishes to be sensitive; this is also the condition neces¬ 
sary for equivalence between binned and unbinned statistics 
as noted above. 

Occasionally, one will find that the fit needs to be 
weighted towards large-scale features in order to (for ex¬ 
ample) place the red giant branch (RGB) in the correct po¬ 
sition. Since the acceptable range of fit parameters (Section 
2.5) is a function primarily of the number of free parame¬ 
ters, fit parameters calculated at various binning sizes can 
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be averaged together and treated normally. (One must ac¬ 
count for this when measuring uncertainties and fit quality, 
of course.) 

To consider the effect of two bin sizes, consider the fol¬ 
lowing two examples. In the first case, there is a 4 x 4 block 
of CMD bins, all with observed values lcr above the model 
values plus random scatter of lcr. (For simplicity, this discus¬ 
sion will be in terms of Gaussian-distributed data and \ 2 ; 
the principles hold true for Poisson-distributed data.) The 
X 2 from this set of bins is 2 per bin, fit a total of y 2 = 32 
compared with an expectation value of x 2 = 16 for 16 bins. 
In contrast, consider the same block of bins, with half lcr 
high and half lcr low. This likewise has a y 2 of 32. Now 
combining the 4x4 block of bins into a single bin, we re¬ 
examine the two cases. In the first, the y 2 value is 17, again 
16 higher than the expectation value of x 2 = 1 for 1 bin. In 
the second, however, the x 2 value is 1, equal to the expecta¬ 
tion value. The conclusion of this exercise is that increasing 
the bin sizes does not increase the sensitivity of the fit pa¬ 
rameter to large features; rather it decreases the sensitivity 
of the fit parameter to small features. This may be necessary 
in some instances, but it should be done with caution. 

The final issue regarding binning schemes is whether 
to use “smooth binning” (dividing the entire CMD into 
equal-sized rectangular bins) or “irregular binning” (select¬ 
ing specially-shaped bins for different CMD regions). As 
mentioned previously, the Poisson likelihood ratio is valid 
for any binning scheme, so either should work. Smooth bin¬ 
ning is statistically advantageous in that they make no a 
priori assumptions regarding the data and generally pro¬ 
duce more degrees of freedom in the fit (and is used in this 
work); irregular binning can better account for known errors 
or uncertainties in the data or theoretical isochrones. 


2.8 Incorporation of Outside Information 

The fitting procedure described in this paper makes no 
a priori assumptions regarding the distance, extinction, 
SFR(t,Z), etc. While this allows a pure, unbiased estimate 
of these parameters, there are frequently constraints that 
should be applied in order to measure the star formation 
history as accurately as is possible. The simplistic approach 
would be to limit the search space to correspond to the max¬ 
imum allowable values of distance, extinction, metallicity, 
etc. However, the incorporation of outside information can 
be done in a way more consistent with the “maximum like¬ 
lihood” approach that is favored here. Recalling that the 
fit parameter is merely —2 In (probability), one can factor in 
the probabilities of the trial star formation history matching 
other observational data in the same way. 

For example, if the mean metallicity of a galaxy is 
known to be ([Fe/H]} = —1.5 ± 0.3 with Gaussian errors, 
but the trial fit has a mean metallicity of [Fe/H] = —2.1, one 
can add 4.0 to the fit parameter, thus giving the metallicities 
equal weight as the photometry. (The value of four comes 
from the Gaussian likelihood ratio, y 2 .) In practice, there 
are several complications to using spectroscopic metallicity 
information. First, the stars used in spectroscopic surveys 
(red giants, for example) are generally not representative 
of all stars in the galaxy. Specifically, red giants are older 
than upper main sequence stars, so using ([Fe/H]) of ages 
older than 2 Gyr may be more appropriate than a ([Fe/H]) 


of all ages. Second, most objects show significant spreads 
in metallicity; it would be preferable to compare the his¬ 
tograms of metallicities rather than just the means. Finally, 
one must be careful about what one means by “metallic¬ 
ity”. The metallicity used in the models is [M/H]; that com¬ 
ing from spectroscopic studies is generally the abundance of 
one or more elements, such as [Fe/H] or [Ca/H], and is not 
necessarily the same value. 

However, after accounting for all of these possibilities, 
constraints from spectroscopic studies, variable star distance 
measurements, extinction maps, etc. can (and should) be 
added to the fit parameter, producing a combined fit param¬ 
eter. This will allow one to answer not only the question of 
“how well does the star formation history match the present 
photometry”, but “how well does the star formation history 
match all known information about this galaxy.” Answering 
second question clearly provides more stringent constraints 
on the solution than the first. 


3 APPLICATION TO DATA 


Having detailed a method for the determination of star for¬ 
mation histories, we now study its application to data sets, 
both simulated and real. The tests with simulated data will 
examine how well star formation histories can be measured 
when the isochrones are a perfect match to the data; the 
tests with real data allow us to examine the effects of pos¬ 
sible errors in the theoretical isochrones. Given that there 
are certainly errors at some level, the question that must be 
answered is whether or not one can obtain a reasonable star 
formation history given these very good but imperfect the¬ 
oretical models. The Galactic dwarf spheroidal companions 
provide ideal targets for this test, because they (1) are suffi¬ 
ciently close that the ancient main sequence turnoff (MSTO) 
is visible, (2) are sufficiently far that line-of-sight depth is 
not a large problem, (3) have little dust to cause internal 
reddening, and (4) the majority have relatively simple star 
formation histories. 

The procedure used to measure star formation histories 
was identical to that described in section 2, except that no 
incorporation of outside data (as described in section 2.£) 
was made. This limitation was intentional, as the purpose of 
these tests is to determine the capabilities of CMD analysis 
alone. (At any rate, given that the WFPC2 field of view is 
much smaller than the galaxies, we cannot expect to obtain 
star formation histories as good as those obtained from wide- 
field ground-based images.) 


3.1 Synthetic Galaxy 1: Single-Population 

The first test of the method is an attempt to reconstruct 
a simple single-burst population, with the CMD shown in 
Figure |l]. To provide a reasonable comparison to Leo II, 
a true distance modulus of 21.60 with zero extinction, as 
well as the Leo II artificial star library, was used when 
generating the synthetic data. The stars were distributed 
evenly in age between 11 and 12 Gyr, and in metallicity 
between [M/H] = —1.75 and —1.65 - here and elsewhere, 
[M/H] = log on the scale of the Girardi et al. (2000) 
isochrones - effectively creating a single-population system 
(in terms of galaxy field populations). The total number of 
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Figure 1. CMD of Synthetic Galaxy 1. The isochrone corre¬ 
sponds to the mean age and metallicity of the galaxy, 11.5 Gyr 
and [M/H] = -1.7. 


stars in the CMD is 16449. This galaxy will also serve as an 
illustration of the analysis procedure. 

The first set of decisions that must be made is the CMD 
region to study and the binning size. In order to retain infor¬ 
mation regarding the oldest MSTO stars while eliminating 
the stars with the worst photometric error, faint-end cuts of 
V < 25.0 and I < 24.5 will be used in this solution. On the 
bright end, we cut off where the incompleteness due to sat¬ 
uration reaches 50%, which gives requirements of V > 17.5 
and I > 16.5. A CMD binning size of 0.05 magnitudes in V 
by 0.025 magnitudes in (V — I) is sufficiently small to ensure 
that all CMD information is retained. (The 1x2 shape of the 
rectangles was chosen to give similar sensitivity in both dis¬ 
tance and extinction, as E(V — I)/Ay is slightly more than 
0.4.) The “observed” CMD, binned accordingly, is shown in 
Figure |[ 

The second set of decisions will be the parameter space 
to explore. Because we are attempting to ascertain the de¬ 
gree of accuracy with which a galaxy’s star formation his¬ 
tory can be recovered, we will attempt a solution with very 
high resolution (higher than will be used below when study¬ 
ing the real galaxies). The metallicity stepsize will be 0.1 
dex in [M/H], the age stepsize will range from 0.3 Gyr at 
young ages to 1 Gyr at old ages, and the distance modulus 
and extinction (Ay) will both be solved with a resolution of 
0.02 magnitudes. Because we have not retained any of the 
lower main sequence, it will be impossible to determine the 
IMF or binary distribution; these values have been fixed at 
a Salpeter slope and a binary fraction of 40% with flat sec¬ 
ondary mass function. Using these parameters, there are 19 
time bins and 19 metallicity bins, for a total of 361 partial 
CMDs. The pure partial CMD corresponding to the input 
age and metallicity is shown in Figure [^. After application 
of observational errors (from the artificial star tests), the 
pure partial CMD becomes the final partial CMD shown in 
Figure [i| 
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Figure 2. “Observed” CMD of synthetic galaxy 1, shown in 
greyscale. The limits on the plot are those used in the solution: 
17.5 < V < 25.0 and -0.3 <V — I < 2.0. 



Figure 3. Pure model CMD (shown in greyscale), calculated 
with an age of 11 — 12 Gyr and metallicity of [M/H] = —1.75 
to —1.65. The limits on the plot are those used in the solution: 
17.5 < V < 25.0 and -0.3 < V - I < 2.0. 


A comparison of Figures and ^ shows that the sim¬ 
ulated observed CMD was indeed drawn from the model 
CMD, in that the overall shape and density of points are the 
same, and no bin with a model value of zero has a nonzero 
number of observed points. (The last point is not obvious 
from the printed images, given the limitations of greyscal¬ 
ing.) Because of this, it is possible to make a statistically- 
valid fit of the simulated observations given the ensemble 
of partial model CMDs. Solving for SFR(t,Z) at a variety 
of distance/extinction combinations, I was able to measure 
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Figure 4. Partial CMD (shown in greyscale), calculated with an 
age of 11 —12 Gyr and metallicity of [M/H] = —1.75 to —1.65. The 
limits on the plot are those used in the solution: 17.5 < V < 25.0 
and -0.3 < V - I < 2.0. 

a minimized fit parameter and its corresponding distance, 
extinction, and star formation history. 

The number of effective free parameters in the solution 
is 9 - 7 star formation rates returned non-zero values, plus 
distance and extinction. The mean difference between the fit 
parameters of the best fit and underlying star formation his¬ 
tory is 9.8, meaning that the error bars are given by all star 
formation histories whose fit parameters are within 9.8 of the 
minimum (1088.1). The distance ((m — M)o = 21.60±0.02) 
and extinction {Ay = 0.00±0.01) can be immediately deter¬ 
mined based on the best fits at each distance and extinction; 
both values agree with the input values. The measured star 
formation rate is shown in panel b of Figure and matches 
the input star formation history extremely well. At the la 
level, there has been a small amount of bleeding from the 
11 — 12 Gyr bin into adjacent bins; however the input and 
recovered histories are consistent at the 2a level and the 
bleeding amounts to a loss of only 3% of the star formation 
in the peak bin. The metallicity was measured correctly, 
with a determined value of [M/H] = —1.70 ± 0.05 dex. 

In order to determine the quality of the fit, the expec¬ 
tation value of the fit parameter (1089.9) and its expected 
variance (1937.1) must be calculated. Comparing with the 
corrected fit parameter value of 1097.9, this translates into a 
goodness-of-fit value of Q = 0.18, or Xeff = 1.01, meaning a 
statistically consistent fit to the “observed” data. (In terms 
of percentiles, the fit is consistent at the 44% level, meaning 
that it is better than 44% of random drawings.) 

The chance of having a good fit is enhanced, of course, 
as the binning of both age and metallicity matches that used 
when creating the model. Whether or not a bad choice of 
bins affects the star formation history can be tested by run¬ 
ning a solution using a different binning scheme. By solv¬ 
ing with the logarithmic age scheme used for the observed 
galaxies, one gets a worst-case estimate (as the break be¬ 
tween the oldest two bins falls at the age of this system). In 



Age (Gyr) 


Figure 5. Star formation histories of Synthetic Galaxy 1. Panel 
a is the input history, panel b is the measured history using the 
entire CMD, panel c is the measured history without the turnoff, 
and panel d is the measured history with only the upper RGB. 
Rates are given relative to the lifetime average rate of 3.44 X 
10—5MQ2/r _ 1. 

making this test, I found a very poor fit quality (Q = 7.19 
and Xeff = 1-37), but measured the correct distance, ex¬ 
tinction, and star formation history. The conclusion is thus 
that binning choices can hurt the fit quality, but are unlikely 
to affect the measured values or their uncertainties. This re¬ 
sult is not entirely a surprise, as the nature of the maximum 
likelihood ratio causes the solution to attempt to match all 
observed points with model points, even if this causes other 
model points to fall where observed points do not. (For ex¬ 
ample, finding one star where the model predicts zero stars 
has a probability of 0.0, while finding no stars where one 
is predicted has a probability of 0.37.) Thus all component 
populations will be fit. 

In order to apply this test to more distant galaxies, it 
is also important to answer the question of how well the 
input star formation history can be recovered in more dis¬ 
tant galaxies where the ancient main sequence turnoffs are 
not present. To accomplish this, the star formation histo¬ 
ries were calculated using photometric cutoffs brighter by 
1.5 and 3.5 magnitudes in both V and I. Increasing the cut¬ 
offs by 1.5 magnitudes limits the solution to the RGB and 
HB; increasing by 3.5 limits it to just the upper RGB. The 
measured star formation histories from these solutions are 
shown in panels c and d of Figure 

The solution with the full RGB and horizontal branch 
(HB) was extremely successful, measuring the distance 
(( m — M) o = 21.60 ± 0.02), extinction (Ay = 0.00 ± 0.01), 
star formation history, and mean metallicity ([M/H] = 
— 1.70 ± 0.05) with nearly the same accuracy as the fit to 
the entire CMD. In fact, increasing the number of stars by 
a factor of 4 to compensate for the 1.5 magnitudes lost (1.5 
magnitudes of distance modulus equals a factor of 2 in dis¬ 
tance) would have produced a final solution to the same 
level of accuracy as the full fit. The reason for the equally- 
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accurate solution is threefold. First, despite losing the main 
sequence turnoff, nearly all evolved stars remain above the 
photometric cutoff. Second, age and metallicity are not com¬ 
pletely degenerate on the RGB, allowing a sensitive numeri¬ 
cal fit to ascertain the correct star formation history. Finally, 
the HB morphology is sensitive to age, which also helps to 
break the degeneracy. The last reason is as much a help as a 
hindrance, however, as the theoretical isochrones generally 
have much greater systematic uncertainties in the HB than 
in the RGB. 

The solution measuring the upper RGB alone lost a 
great deal of information. Although the distance and ex¬ 
tinction measurements were both accurate ((m — M) o = 
21.64 ± 0.15 and Av = 0.00 ± 0.03) they were not precise. 
Increasing the number of stars by a factor of 25 to compen¬ 
sate for the greater distance would reduce the distance and 
extinction uncertainties, but would not greatly improve the 
star formation history measurement. Specifically, with only 
the upper RGB, an acceptable fit can be obtained with stars 
of any age between 7 and 15 Gyr, due to the near-degeneracy 
of age and metallicity. 

3.2 Synthetic Galaxy 2: Composite-Population 

The second synthetic galaxy, whose CMD is shown in Figure 
|(| is a system with a more complex star formation history. 
The same input parameters were used, except for the metal¬ 
licity and age distribution. As is clear from the CMD, the 
star formation history consists of three bursts: 

(i) 0.6 to 1.0 Gyr, [M/H] = —1.0, ~ 5000 stars 

(ii) 2 to 5 Gyr, [M/H] increasing from —1.4 to —1.2, ~ 
11000 stars 

(iii) 8 to 13 Gyr, [M/H] increasing from —1.7 to —1.6, 
~ 16000 stars 

This star formation history was chosen to maximize the pos¬ 
sibility of error in the solution. The metallicity enrichment 
law used here exactly matches the age-metallicity pseudo¬ 
degeneracy, in that the RGB colour is the same at for stars 
of all ages. (This is the reason for the metallicity skips be¬ 
tween the bursts.) Likewise, the system does not contain 
stars of extreme ages (0 or 15 Gyr), allowing the solution 
to err in finding such stars. Finally, the three bursts have 
different shapes. The young burst has a constant star for¬ 
mation rate, the middle burst has the lowest rate in the 
middle of the burst, and the oldest burst has highest rate in 
the middle. The total number of stars in the CMD is 32518. 

The recovered star formation history, given in panels b 
and c of Figure |t|, has a minimized fit parameter of 2990.3. 
Because of the more complex star formation history of this 
galaxy, the number of free parameters in the fit was larger 
(37 star formation rates plus distance and extinction = 
39), which produced a maximum acceptable fit parameter 
of 3032.7. The fit quality is not excellent (Q = 1.80 and 
Xeff = 1-06), but is consistent at better than a 2a (5%) 
level. 

From an examination of the history in panel c of Figure 
|t| it is clear that there are rather large uncertainties in the 
star formation rates caused by the ability of of the prolonged 
star formation episodes to be modeled acceptably using dif¬ 
ferent combinations of populations. (This was not an issue 
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Figure 6. CMD of Synthetic Galaxy 2. The isochrones corre¬ 
spond to the mean ages and metallicities of the three bursts. The 
youngest is 0.8 Gyr and [M/H] = —1.0, the middle is 3.6 Gyr and 
[M/H] = —1.3, and the oldest is 10.0 Gyr and [M/H] = —1.6. 
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Figure 7. Star formation histories of Synthetic Galaxy 2. Panel 
a is the input history, panel b is the measured history using the 
entire CMD, panel c is the measured history using the entire 
CMD, rebinned for smaller uncertainties, panel d is the measured 
history without the turnoff, and panel e is the measured history 
with only the upper RGB. Rates are given relative to the lifetime 
average rate of 4.68 x 10—5M©2/r — 1. 

for galaxy 1, since there was only a single burst.) For exam¬ 
ple, the star formation measured in the 2.5 — 3 Gyr bin, while 
a correct measurement of the input value, can be moved to 
the adjacent bins without significantly hurting the quality 
of the fit. Thus while the input star formation history was 
recovered correctly, we do not have the ability to recover 
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Table 1. Observations 


Galaxy 

Prog ID 

Date 

F555W Exposures 

F814W Exposures 

reference 

Carina 

GTO 5637 

Jan 1995 

200s, 2 x 1100s 

200s, 2x 1100s 

PI Westphal 

Draco 

GTO 6234 

Jun 1995 

200s, 2xl000s“ 

200s, 1100s, 1300s 

Grillmair et al. 1998 

Leo I 

GO 5350 

Mar 1994 

350s, 3 x 1900s 

300s, 3 x 1600s 

Gallart et al. 1999a 

Leo II 

GO 5386 

May 1994 

2 x80s, 8 x 600s 

2 x80s, 8 x 600s 

Mighell &; Rich 1996 

Sagittarius^ 

GO 6614 

May—Oct 1996 

6 x 160s, 2 x 600s 

5 x 160s, 2 x 500s 

Mighell et al. 1997 

Sculptor 

GTO 6866 

Dec 1997 

2 x 1200s, 2 x 1300s 

4 x 1300s 

Monkiewicz et al. 1999 

Ursa Minor 

GTO 6282 

Jul 1995 

200s, 2xll00s“ 

200s, 2 x 1100s 

PI Westphal 


“F606W was used for Draco instead of F555W 

6 Six pointings were obtained in Sagittarius, comprising of three pairs of partially-overlapping fields. One pair is 0.2° from the centre, 
the second is 2.4° from the centre, and the third is a field containing only Galactic foreground stars. 


the structure of the bursts with a high signal-to-noise. In 
order to compensate for this lack of time resolution in the 
bursts, panel c of Figure |j] shows the star formation history 
after additional binning in a somewhat logarithmic scheme. 
As mentioned in section [2.5[ the error bars tend to drop by 
roughly a factor of 2 (rather than by V2) when combining 
two bins, because errors in adjacent bins are correlated. As 
with Synthetic Galaxy 1, the metallicity was measured very 
well, to an accuracy of rougly 0.15 dex in highly-populated 
age bins and 0.3 dex in less-populated bins. 

As with galaxy 1, I have run additional star formation 
history solutions with photometric cutoffs that are brighter 
by 1.5 and 3.5 magnitudes. These results are shown in pan¬ 
els d and e of Figure Once again, the ability to mea¬ 
sure the distance ((m — M)o = 21.59 ± 0.05), extinction 
(Ay = 0.00 ± 0.03) and star formation history is essen¬ 
tially undiminished when subtracting 1.5 magnitudes from 
the photometric cutoff. Even without increasing the number 
of stars (the 1.5 magnitudes of loss again corresponding to a 
factor of 2 in distance, or a factor of 4 in number of stars in 
the field of view), all input parameters were correctly mea¬ 
sured. However, the solution with 3.5 magnitude lost was 
poorly constrained, with large amounts of star formation 
again falling into adjacent bins. All values were measured 
accurately given the uncertainties, of course, but the uncer¬ 
tainties were extremely large. 

The conclusion from the limited-photometry solutions 
is that, while it is always preferable to have photometry 
reaching to ancient main sequence turnoffs, star formation 
histories can be measured accurately when the photometry 
reaches only My = +2. With more restricted photometry, 
however, only broad features of the star formation history 
(perhaps a time resolution of logt = 0.5) can be obtained. 

3.3 Real Galaxy: Leo II 

After analysing the pair of synthetic galaxies, we finally turn 
our attention to the measurement of star formation histories 
of a real galaxy. Leo II will provide the primary example, 
because it is the only system in this sample with many stars 
and a primarily-old star formation history. 

The data were all obtained from the Hubble Space Tele¬ 
scope archive using OTFC; all data sets are non-proprietary. 
The list of observations is given in Table |l]. The data were 
cosmic-ray cleaned and combined using the crclean algo¬ 
rithm of HSTphot (Dolphin 2000b), which is able to com- 



Figure 8. Observed (V—I),V CMD of Leo II; N=12642; N (My < 
4)=5188. The overplotted isochrone corresponds to the mean val¬ 
ues of the best solution: [Fe/H] = —1.13 and t = 9.44 Gyr. 


bine exposures at different gain settings, to provide one deep 
F555W image and one deep F814W image. 

HSTphot was then used to obtain stellar photometry 
and artificial star tests using 43600 artficial stars. Stars 
(both real and artificial) were required to have x < 2.5, 
S/N > 5, and |sharpness| < 0.3 in order to be considered 
detections. Aperture corrections were determined separately 
for each chip and image, and are accurate to half a percent 
(the typical solution had 45 bright stars with an rms scatter 
of 0.03 magnitudes). CTE loss corrections and transforma¬ 
tions to VI were made using an updated calibration solution 
from that given by Dolphin (2000c); the new equations are 
available on the author’s web site. 

The Leo II CMD is shown in Figure 0. The 50% com¬ 
pleteness regimes, determined using artificial star tests, are 
15.9 < V < 27.1 and 14.8 < I < 26.1 (the bright limit deter¬ 
mined by saturation and the faint limit by loss of photons). 

Table 0 lists literature values of the distance, extinc¬ 
tion, and RGB metallicity, as well as semi-empirical mea¬ 
surements of the distance and extinction calculated from my 
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Table 2. Distance and Extinction Values 


Galaxy 

Literature 
(m - M)g A b y 

[M/H] c 

Semi-Empirical 
(m — M) o Ay 

CMD Fitting 
(m — M) o Ay 

Carina 

20.03 ±0.09 

0.20 

-2.0 ±0.2 

19.94 ±0.20 d 

0.19 ± 0.10 e 

20.19 ± 0.13 

0.00 ±0.14 

Draco 

19.58 ±0.15 

0.08 

-2.0 ±0.15 

19.60 ±0.18 / 

0.27 ± 0.15 e 

19.49 ±0.11 

0.28 ±0.08 

Leo I 

21.99 ±0.20 

0.11 

-1.5 ±0.4 

21.84 ±0.14 d 

0.02 ± 0.05® 

21.80 ±0.06 

0.04 ±0.05 

Leo II 

21.63 ±0.09 

0.06 

— 1.9 ±0.1 

21.67 ±0.10^ 

0.12 ± 0.07 e 

21.55 ±0.08 

0.00 ±0.09 

Sagittarius (central) 

17.20 ±0.15 9 

0.47 

—0.3 ± 0.'2 h 



17.11 ±0.14 

0.46 ±0.11 

Sagittarius (outer) 

17.20 ±0.15 9 

0.37 

-0.3±0.2' 1 



17.09 ±0.17 

0.45 ±0.13 

Sculptor 

19.54 ±0.08 

0.06 

-1.8 ±0.1 



19.45 ±0.31 

0.06 ±0.19 

Ursa Minor 

19.14 ±0.10* 

0.10 

-2.2 ±0.1 

19.28 ± 0.25^ 


19.16 ±0.11 

0.12 ±0.09 


“Except where noted, distances were taken from Mateo’s (1998) compilation of literature values. 
b Calculated using the maps of Schlegel, Finkbeiner, &; Davis (1998) 

“Except where noted, metallicities were taken primarily from Mateo’s (1998) compilation of literature values of [Fe/H]. 
^Measured using both the RGB tip, calibrated with the Girardi et al. (2000) isochrones, and the red clump technique described by 

Dolphin et al. (2001b) and Girardi & Salaris (2001). 

“Measured using the RGB color, as per Sarajedini (1994). 

■^Measured using the horizontal branch magnitude, with the calibration of Carretta et al. (2000). 

9 From Bellazzini, Ferraro, & Buonanno (1999a) 

^From Alard (2001), Bonifacio et al. (2000), and Cole (2001). 
includes a more recent measurement by Mighell & Burke (1999). 


CMDs (when possible). A sanity check on the photometric 
calibration is that the values agree; in the case of Leo II this 
is true. Table ^jalso lists the values obtained from the CMD- 
fitting algorithm; a sanity check on the fit is the agreement 
between the semi-empirical and CMD-htting values. 

The CMD of Leo II contains a total of 12642 stars, of 
which 5188 are brighter than My = +4 (V = 25.6) and are 
thus useful in measuring the star formation history. With the 
brightest stars in an ancient, metal-poor population falling 
near My = —3 (V = 18.6), the photometry limits thus 
encompass the necessary range. The CMD shows the basic 
features of an old population - a strong horizontal branch 
and weak upper main sequence. However, the width of the 
main sequence turnoff region, presence of main sequence 
stars above the turnoff, and stars in the red clump region 
(above the red horizontal branch) all indicate the presence 
of a younger stellar population as well. 

The star formation history analysis technique used for 
the two synthetic galaxies was then applied to Leo II. The 
minimized ht parameter was 2335.9, with acceptable values 
as up to 2374.9. The fit quality parameters were Q = 2.15 
and Xeff — 1-09, meaning that the ht is acceptable at the 
2.15 a level. 

While the synthetic galaxies were constructed purely for 
the purpose of testing the method, we hope to obtain some 
basic scientific results from the studies of the real galaxies. 
The age resolution for the Leo II solution was logarithmic, 
since isochrones are more evenly spaced in log! than in t. 
In order to reduce the number of free parameters, these so¬ 
lutions used a metallicity resolution of 0.15 dex instead of 
0.1 dex and 11 time bins rather than 19. However, because 
of the possibility of foreground stars and bad stars, we in¬ 
cluded all three sources of contamination - a foreground star 
CMD, a bad star CMD consisting of a completely random 
distribution, and a bad star CMD consisting of the observed 
data smoothed with a Gaussian kernel with cr = 0.2 magni¬ 
tudes in (V — I) and 0.4 magnitudes in V. As should be clear 
from the lack of distant outliers in the CMD in Figure the 



Figure 9. Star formation and chemical enrichment histories of 
Leo II. The top panel shows the star formation rate, normalized 
to the liftime average rate of 3.8 X 10— 5MQyr~ 1. The bottom 
panel shows the chemical enrichment history. Although the solu¬ 
tion was made with an age resolution of 0.15 dex, this and the 
following figures are plotted with a resolution of 0.3 dex to make 
the features clearer. 


first and second sources are negligible (and were measured 
to be zero), while a small contribution from the smoothed 
observed CMD was needed to produce the best fit. 

The measured star formation history is shown in Fig¬ 
ure |(i|. The obvious feature of this star formation history is 
a prolonged star formation epoch, lasting from 15 Gyr ago 
until about 5-6 Gyr ago. The break at 5-6 Gyr is quite clear; 
the mean star formation rate at older ages is 7.3 times that 
at younger ages. This finding is consistent with the star for¬ 
mation history recovered by Mighell & Rich (1996), but is 
more extended (and older) than that found by Hernandez et 
al. (2000). The reason for the discrepancy is unclear, but it 
should be noted that the younger history measured by Her¬ 
nandez et al. (2000) cannot create the observed blue HB; 
thus some amount of older stars is necessary. It is also ap¬ 
parent from the 1.75<r detection of star formation in the 2 — 4 
Gyr bin that star formation extended until between 2 and 4 
Gyr ago. 
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The measured metallicity values are surprisingly high. 
The photometric metallicity measurement by Mighell & 
Rich (1996), using the technique proposed by Sarajedini 
(1994) applied to these same data, gave a value of [Fe/H] = 
—1.60±0.25, while the preferred mean metallicity measured 
here is — 1.13lo3i “ a difference of nearly half a dex. It 
should be noted, however, that the [Fe/H] = —1.58 red gi¬ 
ant branch (M 2) of Da Costa & Armandroff (1990) used 
by Mighell & Rich has the same colour as an interpolated 
Girardi et al. (2000) isochrone of metallicity [Fe/H] = —1.27 
and logt = 10.15, consistent with the metallicities measured 
by Carretta & Gratton (1997) for the similar-metallicity 
clusters NGC 3201, M 10, and NGC 6752. Additionally, one 
must account for the fact that the mean age of a star in Leo 
II (9.4 Gyr) is much younger than that for a typical globular 
cluster; the same-colour isochrone at this age has a metallic¬ 
ity of [Fe/H] = —1.21. Thus the metallicity measurements 
are entirely consistent; it is the scale used for calibration 
that is different. That the present fit is internally-consistent 
is demonstrated by Figure |i|, which shows the interpolated 
isochrone corresponding to the measured distance, extinc¬ 
tion, mean age, and mean metallicity. 

In summary, the primary result is that the CMD-fitting 
algorithm produced the expected distance, extinction, and 
star formation history to within the uncertainties. In spe¬ 
cific, the extended star formation observed in other studies 
was recovered accurately, with > lcr detections at old ages. 
Additionally, a small amount of younger star formation was 
detected at the > lcr level. Although the measured metal¬ 
licity is much higher than values found in the literature, it 
is consistent once systematic differences in the calibrations 
have been corrected. 


4 SIX DWARF SPHEROIDALS 

In this section, I cover briefly the solutions for seven ad¬ 
ditional observed CMDs of six galaxies. Each CMD will 
present a somewhat different challenge, with varying num¬ 
bers of stars, complexity of star formation, and amount of 
foreground contamination. All were reduced identically with 
HSTphot and the CMD analysis program. 


4.1 Draco 

The first dwarf examined will be the Draco dwarf spheroidal. 
From a cursory examination of the CMD (Figure |l^), one 
expects a simple (mostly-old) star formation history, as the 
turnoff and RGB are both extremely narrow. A possible sign 
of trouble is that the semi-empirical extinction measurement 
is significantly greater than the literature value; this is pos¬ 
sibly because of the filters used (F606W instead of the stan¬ 
dard F555W). However, a zero point error will affect the 
measured distance and extinction, but not the determined 
star formation history. Note that stars within half a mag¬ 
nitude of the RGB tip would be saturated, but this should 
not significantly affect the CMD solution as the upper RGB 
is not strongly populated. 

The measured star formation history is shown in panel 
a of Figure [ll| As was expected from the visual examination 
of the CMD, the only significant star formation episode ap¬ 
pears to be at ancient ages (> 11 Gyr ago). The constraints 
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Figure 10. Observed (V — I),V CMD of Draco; N=3371; 
N(M V < 4)=285. 
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Figure 11. Star formation histories of three old systems: Draco, 
Ursa Minor, and Sculptor. Each is normalized relative to its life¬ 
time average star formation rate. 


on the maximum amount of younger star formation are given 
by the upper error bars; there is the possibility of a signifi¬ 
cant amount of star formation (more than the lifetime aver¬ 
age rate) lasting until 8 Gyr ago, but very little since then. 
The mean metallicity is measured to be [M/H] = —1.7 ±0.4 
dex. The conclusion of an entirely ancient galaxy is consis¬ 
tent with that obtained by the other study of this data set 
(Grillmair et al. 1998) as well as the ground-based work of 
Carney & Seitzer (1986). 


4.2 Ursa Minor 

The Ursa Minor dwarf spheroidal is another system whose 
CMD contains a moderate number of stars. Its CMD is 
shown in Figure |l^; as with Draco, a narrow turnoff and 
RGB imply a simple and predominantly-old star formation 
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Figure 12. Observed (V - I),V CMD of Ursa Minor; N=1941; Figure 13. Observed (V - I),V CMD of Sculptor; N=819; 

N (My < 4)=172. N(M V < 4)=46. 


history. A few stars exist above the turnoff, implying either 
the presence of blue stragglers or a small young population. 
Note that stars within 0.2 magnitudes of the RGB tip would 
be saturated, but it is unlikely that many such stars exist in 
this small field. 

The measured star formation history is shown in panel 
b of Figure |lTj. As was expected from the visual examina¬ 
tion of the CMD, the only significant star formation episode 
appears to be at ancient ages (> 11 Gyr ago). The mean 
metallicity was measured to be [M/H] = —1.5 ± 0.3 dex. 
The conclusion of an entirely ancient galaxy is consistent 
with that obtained by other studies of this data set (Mighcll 
& Burke 1999, Hernandez et al. 2000) - although care should 
be taken in making too much of this comparison, as Hernan¬ 
dez et al. (2000) had an error of more than —0.1 magnitudes 
in their (V — I) colours - as well as the ground-based work 
of Olszewski & Aaronson (1985). 


4.3 Sculptor 

The extreme case, in terms of numbers of stars, is that of 
the Sculptor dwarf spheroidal. While the Leo II CMD had 
5188 stars brighter than My = +4, Sculptor’s (Figure |l3|) 
has only 46. The CMD cuts off at My = +0.2 because of 
saturation; thus the entire upper RGB is lost in these data. 
Given the small number of evolved stars in the CMD, it is 
clearly impossible to measure the star formation history with 
great precision. However, there the CMD shows no evidence 
of young stars. 

The measured star formation history is shown in panel 
c of Figure |Il]. Consistent with the visual examination of the 
CMD, we again find an ancient galaxy. The constraints on 
the maximum amount of younger star formation are quite 
weak, though, and it is possible to have an acceptable fit 
with any one of the younger low-resolution bins increased to 
the lifetime average star formation rate. The mean metallic¬ 
ity is measured to be [M/H] = —1.5 + 0.6. The conclusion of 
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an entirely ancient galaxy is consistent with that obtained 
by other study of this data set (Monkiewicz et al. 1999) as 
well as the ground-based work of Da Costa (1984). 


4.4 Leo I 


Leo I is the first galaxy in this study to contain young stars. 
Its CMD, shown in Figure [H], shows an extremely broad 
turnoff, ranging from ancient stars to very young stars. The 
main sequence itself is dominated by a young population, 
and a few blue helium burners are present. There is likely 
an HB extending bluewards from the base of the red clump; 
however one cannot be entirely sure that those are HB stars 
rather than young stars evolving off the main sequence. It 
was the only object for which observations were made before 
WFPC2 was cooled; the CTE corrections and calibrations 
are thus somewhat more uncertain than for the other ob¬ 
jects in this sample. Nevertheless, as noted previously, any 
error in the zero points (there is no reason to believe any 
such error exists) would affect merely the distance and ex¬ 
tinction measurements; the recovered star formation history 
is unaffected. 

The goodness-of-fit was by far the worst of the galax¬ 
ies studied. The Q value is 7.47, which is similar to that 
measured in the poorly-binned solution of synthetic galaxy 
1; this may indicate that an “unlucky” choice of time bins. 
Nevertheless, we can use the fact that the difference between 
the fit parameters of the best fit and that of the true fit is 
independent of the quality of the fit - in oth er words, we 
can use the formalism defined in section 2.5 to determine 
the uncertainties. 

The measured star formation history is shown in Figure 
[Tt| . Leo I shows star formation detected at the ler level at 
every age from 15 Gyr ago until 0.5 Gyr ago. The largest 
epoch of star formation occurred recently, from 3 to 1 Gyr 
ago, during which time the star formation rate was 2.5 times 
the lifetime average. The burst appears to have begun and 
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Figure 14. Observed ( V — I),V CMD of Leo I; N=31064; Figure 16. Observed (V — I),V CMD of Carina; N=2772; 

N (M v < 4)=22290. N(Af v < 4)=609. 



Figure 15. Star formation and chemical enrichment histories of 
Leo I. The top panel shows the star formation rate, normalized 
to the liftime average rate of 8.5 X 10— 5Moyr~ 1. The bottom 
panel shows the chemical enrichment history. 


ended quite quickly, as the star formation history is inconsis¬ 
tent with a constant rate from 4 Gyr ago until the present. 
Additionally, ancient (> 11 Gyr) stars are detected at the 
la level, despite confusion caused by the young population. 

Nearly all previous studies using these data (Gallart et 
al. 1999b; Hernandez et al. 2000) have concluded that there 
was a strong episode of star formation recently in Leo I. 
Gallart et al. (1999b) estimated that most of the star for¬ 
mation occurred between 1 Gyr and 7 Gyr ago; Hernandez 
et al. (2000) measured bursts centred near 4 and 7.5 Gyr. 
However, the results of Hernandez et al. (2000) are skewed 
by an error of ~ +0.2 magnitudes in the photometric zero 
points, thus making the stars appear older and explaining 
why the two apparent peaks in my star formation history 
(2 — 2.8 and 4 — 5.7 Gyr) are younger than those in theirs. 

The study of the oldest stars from these data has had 
conflicting results. Gallart et al. (1999b) found a negligi¬ 
ble star formation rate beyond 12 Gyr, while Caputo et al. 
(1999) concluded that such star formation likely exists. As 
noted above, I measure a presence of star formation older 


than 11 Gyr at the 1 <t level. The Gallart et al. (1999b), 
result, however, is based on a very large assumed distance 
modulus of 22.18; bringing Leo I to the distance determined 
here would move some of their 9.4 — 12 Gyr star formation 
into the 12 — 15 Gyr range; the presence of old stars has 
been confirmed by NTT observations of the outer regions of 
Leo I by Held et al. (2000). 

4.5 Carina 

The Carina dwarf spheroidal is another system containing 
relatively young stars. While its younger population does not 
dominate the CMD (Figure [l(]) as much as that of Leo I, 
the significant foreground contamination makes the solution 
more difficult. The main sequence extends to V ~ 22.5, cor¬ 
responding to an absolute magnitude of +2.3. The smaller 
number of stars makes stronger conclusions more difficult, 
of course - the 15 blue helium burners seen in Leo I would 
scale down to 0.4 given the relative numbers of stars in the 
two fields. 

The distance and extinction values from the CMD 
fit, shown in Table are the only ones that disagree 
with semi-empirical values calculated from the CMD. It 
is likely that the foreground contamination and lack of a 
well-photometered lower main sequence made the distance 
and extinction question poorly-constrained. This is a system 
for which incorporation of outside information (such as the 
Schlegel et al. extinction value) would clearly be profitable. 

The measured star formation history is shown in panel 
a of Figure |l7j. Carina appears to show continuous star for¬ 
mation (at the resolution possible) from its earliest star for¬ 
mation episode until roughly 2 Gyr ago. Optimal science can 
only be obtained with a wider-held camera, of course; the 
primary purpose of studying these data is to determine how 
precisely the star formation history can be determined from 
this CMD. 

For comparison, the previous studies of these data 
© 2001 RAS, MNRAS 000, 0 0 










Star Formation History Measurement 17 


! (a) Carina 

1 i 

; 

i ■4-j — ■ ■ — 



- 




; 

! (b) Sagittar 

us inner field 

. 






_ 


! (c) Sagittar 

us outer field 


j 


j 

lL i _ 


i 

i 


0 5 10 15 


Age (Gyr) 

Figure 17. Star formation histories of two mixed-age systems: 
Carina and Sagittarius. Each is normalized relative to its lifetime 
average star formation rate. 

(Mighell 1997, Hernandez et al. 2000) have also noted strong 
intermediate-aged stellar populations. The bulk of stars 
found by Mighell (1997) have ages between 4 and 10 Gyr; 
Hernandez et al. (2000) measured 1 Gyr-wide peaks centred 
at 3, 5, and 8 Gyr. Using ground-based data, Hurley-Keller, 
Mateo, & Nemec (1998) find a 3-burst structure with ages 3, 
7, and 15 Gyr. The present work finds significant amounts of 
stars younger than 4 Gyr, contradicting Mighell (1997) but 
agreeing extremely well with Hurley-Keller et al. (1998). As 
with Leo I, there is some ambiguity as to whether or not 
ancient stars exist. Mighell (1997) and Hurley-Keller et al. 
(1998) found evidence of them, while Hernandez et al. (2000) 
did not. Figure [fl] shows the presence of old (> 8 Gyr) stars 
at the la level. As was the case for Leo II, the presence of 
a strong blue HB (Smecker-Hane et al. 1994) would argue 
against the history proposed by Hernandez et al. (2000), 
who find essentially no star formation older than 10 Gyr. 
A serious (~ —0.2 magnitudes in (V — /)) error in their 
photometric zero point likely causes their spurious result. 

4.6 Sagittarius 

The final galaxy to be examined is the Sagittarius dwarf 
spheroidal. It provides an even more difficult challenge than 
Carina, as the number of stars in the CMD is not signifi¬ 
cantly greater, while there is a tremendous amount of fore¬ 
ground contamination. The CMDs of the central field (0.2° 
from the centre) and outer field (2.4° from the centre) are 
shown in Figures |l8| and [l9| Because of the foreground con¬ 
tamination, only a very rough estimate of the star formation 
can be made - there are no extremely young stars, but the 
central field does appear to have a broad turnoff extending 
up to V ~ 20.4. The turnoff in the outer field does not ex¬ 
tend as high, but is still much broader than those of Ursa 
Minor or Draco. In both fields, the CMD cuts off about 2 
magnitudes below the RGB tip. Given the foreground con¬ 
tamination, however, it is unlikely that Sagittarius RGB 
stars would have been distinguishable from foreground main 
sequence stars. 

Because of the possibility of differences in extinction 
and star formation history, the CMDs of the two fields were 
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Figure 18. Observed (V — I),V CMD of the central Sagittarius 
field; N=6553; N(My < 4)=809. 
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Figure 19. Observed (V — I),V CMD of the outer Sagittarius 
field; N=3329; N(My < 4)=408. 


fit separately. The measured star formation histories are 
shown in panels b and c of Figure [T?]. The central field shows 
measurable star formation until ~ 2 Gyr ago, while the outer 
field shows only about half the number of young (< 8 Gyr) 
stars. Sagittarius is also the only system for which signifi¬ 
cant chemical enrichment is measured. The inner field shows 
a metallicity of [M/H] = —1.1 ±0.4 dex at ages older than 8 
Gyr, which increased to [M/H] = 0.0 ± 0.2 for the youngest 
large population of stars (2 — 4 Gyr old). The history seen in 
the outer field is consitent, but the uncertainties are larger 
because of the smaller number of stars. 

No literature data currently exists using these WFPC2 
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Table 3. Summary of Results 


Quantity 

Ursa Minor 

Draco 

Sculptor 

Leo II 

Sagittarius 

Carina 

Leo I 

<t> (Gyr) 

12.7 

12.2 

11.5 

9.4 

9.0 

7.2 

6.1 

at (Gyr) 

1.9 

2.3 

3.1 

3.1 

3.7 

3.3 

4.1 

«M/Hr 

-1.5 ±0.3 

-1.8 ±0.4 

-1.5 ±0.6 

—1.1 ± 0.3 

-0.6 ±0.4 

-1.2 ±0.4 

-1.0 ±0.2 


“Metallicities are given on the scale of the Girardi et al. (2000) isochrones; [M/H] = log(Z/0.02). 


images; however a few ground-based studies have been car¬ 
ried out (Layden & Sarajedini 1997; Marconi et al. 1998; and 
Bellazzini et al. 1999a). Marconi et al. (1998) and Bellazzini 
et al. (1999b) found very large metallicity dispersions, con¬ 
sistent with the significant metallicity evolution measured 
in this work. The star formation history is agreed to be ex¬ 
tended, with a peak age agreed to fall between 8 and 11 Gyr; 
this compares favorably with the mean ages measured here 
(8.6 Gyr in the inner field and 9.8 Gyr in the outer field). 


5 CONCLUSIONS 

An examination of the technique for measuring star forma¬ 
tion histories has been presented. While the underlying con¬ 
cept - finding the star formation history most likely to have 
produced the observed data - is straightforward, there are 
a number of potential traps that must be overcome. In gen¬ 
erating synthetic CMDs, one must take care to ensure that 
all possible outcomes have been sampled; this is not done 
by “random drawing” techniques in which a certain number 
of stars are randomly drawn and placed on the CMD. In¬ 
stead, it is necessary to make a true model CMD - a CMD 
that represents the probability distribution from which the 
data could have been drawn. The first step is to make fine 
interpolations of the isochrones in age, metallicity, and mass 
so that all possible single stars are accounted for. One must 
also account for the possibility of binaries by considering 
a number of possible secondary passes sufficiently large to 
create a smooth model CMD. The final step in generating a 
partial CMD (CMD for a small range of age and metallic¬ 
ity) is to apply the results of artificial star tests. The model 
CMD can be generated from any combination of the partial 
CMDs (different combinations correspond to different star 
formation histories), plus a model of foreground contamina¬ 
tion and models of bad detections. 

I have demonstrated the inadequacy of a \ 2 minimiza¬ 
tion when fitting Poisson-distributed data (as is the case 
here). Specifically, a \ 2 minimization will always minimize 
with the wrong star formation history; the only question is 
how wrong the answer will be. Instead, a Poisson likelihood 
ratio is recommended, the equation given in equation |lf)| It 
has also been demonstrated that the “Saha W” (Saha 1998) 
statistic is not designed for model-data comparisons. The 
Bayesian inference scheme of Tolstoy & Saha (1996) pro¬ 
vides an accurate solution of relative star formation rates 
but not the overall mean star formation rate. The question 
of binning vs. non-binning is demonstrated to be unimpor¬ 
tant, as the same star formation rate will be obtained so 
long as the bin sizes are as small as the smallest features 
of the model CMD. Finally, techniques for measuring un¬ 
certainties and determining the overall fit quality are given, 


as well as a method in which outside data (such as a red 
giant metallicity distribution) can be incorporated into the 
fit without use of a prior. 

The technique was then applied to a pair of syn¬ 
thetic galaxies - one single-population and one composite- 
population. The star formation history of the single¬ 
population system was measured with an age accuracy of 
±0.03 dex and distance and extinction accuracy of 0.02 
magnitudes, provided that at least the RGB and HB were 
included in the data (depth of My = +2). Most of the 
constraints were lost, however, when restricting the solu¬ 
tion to only the upper RGB (depth of My = 0); this in¬ 
troduced an age uncertainty of ±0.2 dex into the solution. 
Although the quality of the fit was severely degraded when 
using an intentionally-wrong set of age bins, we note that the 
measured distance, extinction, and star formation history 
were all correct. The star formation history of the synthetic 
composite-population system was measured with less accu¬ 
racy, with resolution of roughly ±0.07 dex producing reason¬ 
able signal-to-noise with photometric depth of My = ±2. 
However, the solution with a photometric limit of My = 0 
was again very uncertain, with age resolution degraded to 
±0.25 dex. The quoted resolutions, of course, are dependent 
upon the number of stars in the observed field; the uncer¬ 
tainties scale as 1/VN. 

Finally, I showed measurements of the star formation 
histories of seven dwarf spheroidal companions. While each 
data set had a different quality (number of stars, photomet¬ 
ric depth, and amount of foreground contamination), the 
ability to accurately measure uncertainties allows one to give 
the best answer and the uncertainty in the measurement for 
each object. Thus the star formation history can always be 
measured - even with the very poor Sculptor CMD - but 
better data will naturally result in smaller uncertainties. 

The technique-related findings of this study can be sum¬ 
marized as follows: 

(i) In every case, the calculated star formation history 
matched with the qualitative star formation history ob¬ 
tained by a cursory examination of the CMD. In nearly ev¬ 
ery case, the distance and extinction were consistent with 
literature values. 

(ii) The number of stars with My < ±4 required to pro¬ 
duce results with signal-to-noise > 1 at moderate resolution 
appears to be about 150 for an old system and 500-1000 for 
a system with many young stars. 

(iii) Even with the uncertainties in the isochrones, all 
CMDs were well-fit. The largest Xeff was 1-16, and only 
the Leo I fit was worse than 2.5 a from an ideal solution. 

(iv) In the case of the Sagittarius dwarf, a large amount 
of foreground contamination (more foreground stars than 
Sagittarius stars) does not add significantly to the fit uncer- 
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tainties. This is likely because the main sequence and MSTO 
of Sagittarius are sufficiently separated from the bulge main 
sequence. 

Scientifically, the results are limited by the fact that 
only a small fraction of each galaxy was studied. The Leo 
spheroidals had sufficient numbers of stars for accurate star 
formation history measurements; the others produced only 
rough star formation histories. The consistent feature of the 
star formation histories is that ancient (> 8 Gyr) star for¬ 
mation was detected in all eight CMDs at the lcr level. After 
the ancient burst, some (Ursa Minor, Draco, and Sculptor) 
show no evidence of young star formation. Leo II shows star 
formation covering about half its lifetime, while Carina and 
Sagittarius appear to have formed stars until ~ 2 Gyr ago. 
Finally, Leo I shows a very strong young burst, with its star 
formation rate 2 — 3 Gyr ago nearly four times its lifetime 
average. Results for the galaxies are summarized in Table [i| 
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